# High Accuracy Transcription

Parakeet Tdt Ctc 0.6b Ja

This model is a Japanese automatic speech recognition (ASR) model based on the FastConformer architecture, developed by NVIDIA and converted to MLX format.

Speech Recognition

Stt Ru Fastconformer Hybrid Large Pc Onnx

NVIDIA FastConformer-Hybrid Large is a Russian automatic speech recognition model based on the FastConformer architecture, supporting CTC and RNN-T decoders.

Speech Recognition

Whisper Custom Small

A small speech recognition model based on the OpenAI Whisper architecture, focused on English speech-to-text tasks.

Speech Recognition English

Whisper Large V3 Turbo Shqip

An Albanian-optimized speech recognition model based on OpenAI Whisper Large v3 Turbo, supporting standard Albanian and the Gheg dialect.

Speech Recognition

Transformers Other

Voice Clone Large Finetune Final

A voice cloning model fine-tuned from openai/whisper-large-v3 and primarily used for speech recognition tasks, reporting a word error rate of 15.3572 on the evaluation set.

Speech Recognition

Whisper Large V3 Gguf

Whisper is a multilingual automatic speech recognition (ASR) system that supports speech-to-text tasks in multiple languages.

Speech Recognition Supports Multiple Languages

Belle Whisper Large V3 Zh

A Chinese speech recognition model fine-tuned from whisper-large-v3, showing significant performance improvements on multiple Chinese speech benchmarks.

Speech Recognition

Stt Fa Fastconformer Hybrid Large

A hybrid model for Persian automatic speech recognition (ASR), built on the FastConformer architecture and trained with combined transducer and CTC decoder losses.

Speech Recognition Other

Whisper Large V3 German

A German speech recognition model fine-tuned from Whisper Large v3 and optimized for German speech.

Speech Recognition

Transformers German

Wav2vec2 Base 960h

An ONNX conversion of Facebook's wav2vec2-base-960h model, designed for Transformers.js and supporting in-browser speech recognition.

Speech Recognition

Wav2vec2 Large Xlsr 53 English

A large-scale speech recognition model based on the wav2vec 2.0 architecture, supporting English speech-to-text conversion.

Speech Recognition

Faster Whisper Large V2 Mix Jp

The CTranslate2 conversion of the whisper-large-v2-mix-jp model, suitable for Japanese speech recognition tasks.

Speech Recognition Japanese

ASCEND Dataset Model

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the ASCEND dataset.

Speech Recognition

Wav2vec2 Base Libir Zenodo

An automatic speech recognition model fine-tuned from facebook/wav2vec2-base-960h on an unspecified dataset.

Speech Recognition

Wav2vec2 Gujarati Stt

This is a Gujarati speech recognition model based on the Wav2Vec2 architecture, capable of directly converting Gujarati speech into text.

Speech Recognition

Wav2vec2 Gpt2 Wandb Grid Search

An automatic speech recognition (ASR) model trained on the LibriSpeech dataset.

Speech Recognition

Wav2vec2 Kannada Stt

A Kannada speech recognition model based on the Wav2Vec2 architecture, capable of directly converting Kannada speech into text.

Speech Recognition

Wav2vec2 Urdu Stt

This is an Urdu speech recognition model based on the Wav2Vec2 architecture, capable of converting Urdu speech into text.

Speech Recognition

Wsj0 Full Supervised

A speech recognition model fine-tuned from facebook/wav2vec2-large-lv60 on the WSJ0 dataset, achieving a word error rate of 0.0343 on the evaluation set.

Speech Recognition
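Two entries above report a word error rate (WER) without stating how it was computed or on what scale, and the two figures (15.3572 and 0.0343) may not be on the same scale. As a rough illustration only, and not the evaluation code used by any of these models, the sketch below computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for this example; real evaluations usually also normalize casing and punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One word of six is dropped, so the result is 1/6 as a fraction.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

This function returns a fraction, so a value of 0.0343 on this scale corresponds to 3.43%; multiply by 100 before comparing with figures reported as percentages.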


© 2025 AIbase